ABSTRACT

We present the results of a Monte Carlo study of the power
of a new multivariate goodness-of-fit test.


INTRODUCTION 

Foutz [1] developed a new test of goodness-of-fit for multivariate
distributions. The test can also be used to fit univariate
distributions; in an earlier paper [2] the authors compared the
Foutz test with the chi-square and Kolmogorov-Smirnov tests. The
results indicated that the Foutz test is more powerful in detecting
certain characteristics than the other two tests. This paper deals
with the performance of the test when fitting multivariate
distributions. More specifically, we investigate the power of the test
when fitting bivariate and trivariate normal distributions for
various choices of the mean vector and the covariance matrix. In
the second section we present a brief description of the Foutz
test; a discussion of the simulation procedure is in the third
section, and the results of the simulation are in the final section.


THE FOUTZ TEST

Let X_1, X_2, ..., X_{n-1} be a random sample of size n-1 from a
p-variate continuous distribution. The first step of the Foutz
test is to divide the sample space into n statistically equivalent
blocks B_1, B_2, ..., B_n and then determine a "continuous empirical
distribution function (CEDF)" H_n. The test statistic for the
hypothesis that the true c.d.f. is H is

(1)    F_n = \sup_{\text{all events } B} | P_n(B) - P_H(B) |

where P_n(B) and P_H(B) are the empirical and hypothesized probability
measures of an event B, computed from H_n and H respectively.
An equivalent computational formula for F_n is

       F_n = \sum_{i=1}^{n} \max[0, 1/n - D_i]

where D_i = P[X \in B_i | H].
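
The computational formula can be illustrated with a few lines of Python. This is only a sketch: the function name `foutz_statistic` and the input checks are assumptions made for illustration, and the block contents D_i are taken as already computed under the hypothesized distribution.

```python
def foutz_statistic(block_probs):
    """Return F_n = sum_i max(0, 1/n - D_i) for block contents D_1, ..., D_n."""
    n = len(block_probs)
    if n == 0:
        raise ValueError("need at least one block")
    if abs(sum(block_probs) - 1.0) > 1e-9:
        raise ValueError("block contents must sum to 1")
    # Only blocks whose hypothesized content falls short of 1/n contribute.
    return sum(max(0.0, 1.0 / n - d) for d in block_probs)


# Equal contents: the hypothesized measure agrees with the empirical one.
print(foutz_statistic([0.25, 0.25, 0.25, 0.25]))
# All hypothesized mass in one block: the other three fall short by 1/4 each.
print(foutz_statistic([1.0, 0.0, 0.0, 0.0]))
```

The statistic is zero when every block has hypothesized content 1/n and grows as the hypothesized mass concentrates in fewer blocks.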

A general procedure for the construction of statistically
equivalent blocks is the following. Select n-1 "cutting functions"
h_k(X), k = 1, 2, ..., n-1, such that h_k(X) has a continuous
distribution, and a permutation k_1, k_2, ..., k_{n-1} of the integers
1, 2, ..., n-1. Order the samples X_i according to the value of
h_{k_1}(X_i); let X(k_1) be the vector associated with the k_1-th order
statistic. Partition the sample space into two blocks B_1 and B_2
defined by

B_1 = \{X : h_{k_1}(X) < h_{k_1}[X(k_1)]\}  and  B_2 = \bar{B}_1.

At the second step, if k_2 < k_1, order the k_1 sample vectors in B_1
according to h_{k_2}(X) and let X(k_2) be the k_2-th order statistic.
Partition B_1 into two sub-blocks B_{11} and B_{12}; at this stage the sample
space is partitioned into three blocks B_{11}, B_{12} and B_{20} = B_2. If
k_2 > k_1, order the n - 1 - k_1 sample values in the block B_2
according to h_{k_2}(X). Let X(k_2) be the (k_2 - k_1)-th order statistic
and partition the block B_2 into two sub-blocks B_{21} and B_{22}; take
B_{10} = B_1. Continue the process until all the cutting functions are
exhausted; this results in a partition of the sample space into n
statistically equivalent blocks B_1, B_2, ..., B_n. More details on
the procedure and some examples are available in [3].

The null c.d.f. of the test statistic F_n (necessary to determine
the critical values) is quite difficult to derive even for
small n; for n = 3, 4, 5 formulas for the exact c.d.f. are in [2].
Foutz proposed a large sample normal approximation with mean
\mu = e^{-1} and variance \sigma^2 = (2e^{-1} - 5e^{-2})/n. In our earlier
study [2] we found that with this approximation the observed
significance level is about 10-20% smaller than the nominal values for
sample sizes n - 1 = 20, 30, 50. We therefore proposed the use of
empirically generated critical points (based on 80,000 simulated F_n
values), given in Table I below.

TABLE I. EMPIRICAL CRITICAL VALUES FOR FOUTZ TEST

Sample            Significance level
 size          .10        .05        .01

  20         .42714     .44865     .48659
  30         .41903     .43535     .46579
  50         .40816     .42116     .44487
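
To see how the large sample normal approximation compares with the empirical critical values of Table I, the approximation can be evaluated directly. The sketch below assumes an upper-tail test and takes the number of blocks as n = (sample size) + 1, following the notation of the text; the function name `foutz_normal_critical_value` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def foutz_normal_critical_value(sample_size, alpha):
    """Upper-tail critical value of F_n from the large sample normal approximation.

    sample_size is the number of observations (n - 1 in the text's notation),
    so the number of blocks is n = sample_size + 1.
    """
    n = sample_size + 1
    mean = exp(-1.0)
    variance = (2.0 * exp(-1.0) - 5.0 * exp(-2.0)) / n
    z = NormalDist().inv_cdf(1.0 - alpha)  # standard normal upper quantile
    return mean + z * sqrt(variance)


# Empirical critical values copied from Table I, keyed by (sample size, alpha).
TABLE_I = {
    (20, 0.10): 0.42714, (20, 0.05): 0.44865, (20, 0.01): 0.48659,
    (30, 0.10): 0.41903, (30, 0.05): 0.43535, (30, 0.01): 0.46579,
    (50, 0.10): 0.40816, (50, 0.05): 0.42116, (50, 0.01): 0.44487,
}

for (size, alpha), empirical in sorted(TABLE_I.items()):
    approx = foutz_normal_critical_value(size, alpha)
    print(f"size={size:2d}  alpha={alpha:.2f}  normal={approx:.5f}  empirical={empirical:.5f}")
```

Under these assumptions the normal-approximation critical value is somewhat larger than the empirical one in every cell, which agrees with the observation that the approximation rejects less often than the nominal level.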

We have also generated a variation of the large sample normal
approximation to calculate more accurate estimates of the critical
values. We are presently in the process of using this approximation
to generate tables for a spectrum of critical values and
sample sizes.

SIMULATION PROCEDURE

To investigate the power of the Foutz test, 5,000 replicates
each of samples of size 20 from several bivariate and trivariate
normal distributions were generated; in all cases the hypothesis
tested was that the samples are from a bivariate/trivariate normal
distribution with zero mean vector and identity covariance matrix.
The true values of the means, variances and covariances for
the generated samples were chosen so as to study the effect of
(i) shifts in the means only, (ii) shifts in the variances only,
(iii) shifts in the covariances only, and a few cases involving a
combination of all three.

The method of blocking we implemented was as follows. We let
the samples themselves implicitly determine the permutation k_1, k_2,
..., k_{n-1} and the order of the cutting functions, which were all
taken to be coordinate functions, i.e., h_k(X) = X^{(i)}, the i-th
coordinate of the sample vector X, for some i. The following example
with p = 2 (bivariate samples) will illustrate the procedure.
Suppose x_1, x_2, ..., x_{n-1} are the observed sample vectors. The first
cut on the p-dimensional sample space is made at x_1^{(1)}, the first
coordinate of the first sample vector x_1. This partitions the
sample space into two blocks B_1, B_2 defined by

          1st coordinate           2nd coordinate

B_1       (-\infty, x_1^{(1)})     (-\infty, +\infty)

B_2       (x_1^{(1)}, +\infty)     (-\infty, +\infty)

The second sample vector x_2 will be in one of the two blocks
B_1 or B_2. Assume that it is in B_2; partition B_2 into two sub-blocks

          1st coordinate           2nd coordinate

B_{21}    (x_1^{(1)}, +\infty)     (-\infty, x_2^{(2)})

B_{22}    (x_1^{(1)}, +\infty)     (x_2^{(2)}, +\infty)

where x_2^{(2)} is the value of the second coordinate of the sample
vector x_2. At this stage the sample space is partitioned into
three blocks B_1, B_{21} and B_{22}. Continuing this process (letting
the cutting function at the q-th stage be x_q^{(r)}, the r-th coordinate
of x_q, where r = [(q-1) mod p + 1]) until all the sample vectors
are exhausted will result in a partition of the sample space
into n statistically equivalent blocks. In our simulation we
used both rectangular and polar/spherical coordinates to examine
whether one scheme is more adept at detecting certain types of
violations of the null hypothesis. When using spherical coordinates,
the first coordinate is taken to be the radius vector (same for polar
coordinates), the second coordinate is the angle in the horizontal
plane, and the third coordinate is the angle measured from the vertical
axis. Figures 1 and 2 graphically demonstrate the construction of the
blocks for five bivariate samples using the rectangular and polar
coordinate systems. The implicit permutation and the order of the
coordinate cutting functions are also included in the figures. It
should be noted that for polar/spherical coordinates it is necessary
to make an additional initial cut, which we took at \theta = 0°.

The probability contents of the statistically equivalent blocks
under the null hypothesis (multivariate normal with zero mean vector
and identity covariance matrix) are easily computable as products
of univariate probabilities (normal, chi-square and uniform) for the
rectangular coordinate system as well as the polar/spherical
coordinate system.
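
The rectangular blocking rule and the product form of the block contents can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program used for the study: all function names are hypothetical, the alternatives are restricted to independent coordinates (shifts in means and standard deviations only, no covariance shifts), the replicate count is reduced from 5,000 for speed, and the critical value is the Table I entry for sample size 20 at the .05 level.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.
INF = float("inf")


def rectangular_blocks(samples):
    """Split the sample space once per sample; return len(samples) + 1 boxes.

    A box is a list of (low, high) intervals, one per coordinate. At stage q
    the box containing sample q is cut at that sample's r-th coordinate.
    """
    p = len(samples[0])
    blocks = [[(-INF, INF)] * p]  # start with the whole sample space
    for q, x in enumerate(samples, start=1):
        r = (q - 1) % p  # 0-based form of r = (q - 1) mod p + 1
        for idx, box in enumerate(blocks):
            if all(lo < xi < hi for xi, (lo, hi) in zip(x, box)):
                break
        else:
            raise ValueError("sample lies on a block boundary")
        lo, hi = box[r]
        lower = box[:r] + [(lo, x[r])] + box[r + 1:]
        upper = box[:r] + [(x[r], hi)] + box[r + 1:]
        blocks[idx:idx + 1] = [lower, upper]
    return blocks


def null_content(box):
    """Content of a box under N(0, I): a product of univariate normal probabilities."""
    prob = 1.0
    for lo, hi in box:
        prob *= PHI(hi) - PHI(lo)
    return prob


def foutz_statistic_rectangular(samples):
    """F_n = sum_i max(0, 1/n - D_i) over the rectangular blocks."""
    blocks = rectangular_blocks(samples)
    n = len(blocks)
    return sum(max(0.0, 1.0 / n - null_content(b)) for b in blocks)


def rejection_rate(mean, sd, critical_value, sample_size=20, replicates=500, seed=1):
    """Monte Carlo rejection rate for independent normal coordinates."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, sd)]
                   for _ in range(sample_size)]
        if foutz_statistic_rectangular(samples) > critical_value:
            rejections += 1
    return rejections / replicates


CRITICAL_05 = 0.44865  # Table I, sample size 20, significance level .05
print("null hypothesis true :", rejection_rate([0.0, 0.0], [1.0, 1.0], CRITICAL_05))
print("mean shifted by 1 sd :", rejection_rate([1.0, 0.0], [1.0, 1.0], CRITICAL_05))
```

Polar/spherical blocking would follow the same pattern, with the normal probabilities for the coordinate intervals replaced by chi-square probabilities for the radius and uniform probabilities for the angles.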

RESULTS 
For ease of comprehension, the results for shifts of mean or
variance/covariance are given as power curves in Figures 3-7.
All power curves are based on 5,000 replications at the .05
significance level. The results are indicative of those obtained for
significance levels of .01 and .10. Full details are available in
Linhart [3].
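
When comparing rejection rates it helps to know their Monte Carlo precision. Assuming the 5,000 replications are independent, each rate is a binomial proportion, and its standard error can be sketched as below; the function name `rejection_rate_se` and the normal-theory 95% band are assumptions made for illustration.

```python
from math import sqrt


def rejection_rate_se(rate, replications=5000):
    """Binomial standard error of a rejection rate estimated from independent runs."""
    return sqrt(rate * (1.0 - rate) / replications)


for rate in (0.05, 0.60):
    se = rejection_rate_se(rate)
    print(f"rate={rate:.2f}  s.e.={se:.4f}  approx. 95% band=+/-{1.96 * se:.4f}")
```

Under this assumption, differences between two tabulated rates of roughly .01 to .02 or less may be within simulation noise.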

Shifts in mean are detected well. Figures 3 and 4 show that a
shift of one standard deviation in the mean results in about a 60%
rejection rate for both bivariate and trivariate samples. The
rectangular method of blocking consistently resulted in a higher
rejection rate than did the polar/spherical method.

Power curves for shifts in variance are given in Figures 5 and 6.
Small shifts in variance are not detected very well, but larger
shifts and shifts in more than one component resulted in higher
rejection rates. Neither blocking method produced rejection rates
significantly better than the other except in the trivariate case
when one variance was shifted. In general, the polar/spherical
method of blocking gives slightly higher rejection rates, but not
consistently.

The power curves for shifts in covariance are given in Figure 7.
Except for highly correlated data, neither blocking scheme
detects these shifts very well. The polar/spherical method of
blocking usually gives somewhat higher rejection rates than the
rectangular method.

To test for possible confounding in detection of shifts in
both mean and variance/covariance, we made the series of runs
listed in Tables II and III. Shifts are generally of increasing
amount as one looks down and to the right in the tables.
Rejection rates also increase as one looks down and to
the right, indicating that there is no detectable confounding. The
rectangular method of blocking generally gave better results with
multiple shifts.

All of the above results are based on a sample size of 20 and a
significance level of .05. To determine possible trends over sample
size or significance level we made the series of runs given in
Tables IV and V. Larger sample sizes generally led to higher
rejection rates, as is to be anticipated. Exceptions did occur
when rejection rates were close to the significance level, i.e.,
when the shift was not detected very well. The rectangular blocking
method generally gave higher rejection rates.

In conclusion, the Foutz test is adept at detecting shifts in
mean, less powerful at detecting shifts in variance, and poor at
detecting correlation among the variates unless they are highly
correlated. In addition, shifts in mean are not disguised when a
shift in variance/covariance is also present.

FIGURE 3: POWER CURVES FOR SHIFTS IN MEAN

TABLE II

REJECTION RATES FOR MULTIPLE SHIFTS IN
MEAN AND VARIANCE-COVARIANCE

\alpha = .05, n = 21*

(Rows: mean vectors; columns: covariance matrices Sigma; both numbered
in tabulated order.)

                              Sigma
Mean         1         2         3         4         5

 1        .0488     .0912     .1986     .2500     .9658
          .0482     .1176     .1522     .2162     .9572

 2        .1710     .2398     .3110     .3702     .9720
          .1388     .2482     .2498     .3402     .9650

 3        .5606     .7346     .6384     .6828     .9820
          .4348     .5952     .5334     .6316     .9764

 4        .9418     .8774     .9350     .8658     .9892
          .8576     .8588     .8722     .8308     .9840

 5        .9998     .9998     .9902     .9950     .9990
          .9950     .9990     .9772     .9882     .9964

* First Entry - Rectangular coordinate blocking
  Second Entry - Polar coordinate blocking

TABLE III

REJECTION RATES FOR MULTIPLE SHIFTS IN
MEAN AND VARIANCE-COVARIANCE

\alpha = .05, n = 21*

(Rows: mean vectors; columns: covariance matrices Sigma; both numbered
in tabulated order.)

                              Sigma
Mean         1         2         3         4         5

 1        .0440     .0674     .5392     .?828     .9770
          .0518     .0582     .4584     .7840     .9720

 2        .0480     .1830     .5708     .7946     .9832
          .0280     .1176     .5034     .8020     .9740

 3        .3728     .6352     .6852     .8206     .9888
          .1174     .2912     .6254     .8422     .9824

 4        .7400     .9074     .9270     .9668     .9930
          .7392     .7454     .8602     .9454     .9870

 5        .9982     .9956     .9716     .9736     .9978
          .9726     .9752     .9742     .9774     .9976

* First Entry - Rectangular coordinate blocking
  Second Entry - Polar coordinate blocking

TABLE IV
REJECTION RATES FOR INCREASING SAMPLE SIZES

(Shifts in \mu and \Sigma are numbered in tabulated order.)

                      Sample size (n-1)
Shift               20        30        50

\alpha = .01

 1                .0574     .0860     .1270
                  .0430     .0564     .0754

 2                .1294     .2026     .3652
                  .1230     .1418     .2508

 3                .0126     .0140     .0176
                  .0136     .0152     .0170

 4                .1864     .2722     .4522
                  .1578     .2244     .3744

\alpha = .05

 1                .1710     .2170     .2914
                  .1388     .1630     .2238

 2                .3024     .4144     .6030
                  .3164     .3076     .4826

 3                .0576     .0624     .0728
                  .0656     .0624     .0760

 4                .3786     .4884     .6756
                  .3342     .4304     .6016

\alpha = .10

 1                .2700     .3228     .4256
                  .2298     .2658     .5400

 2                .4406     .5424     .7190
                  .4554     .4356     .6066

 3                .1173     .1174     .1396
                  .1258     .1196     .1422

 4                .5150     .6132     .7800
                  .4640     .5566     .7160

First Entry - Rectangular coordinate blocking
Second Entry - Polar coordinate blocking

TABLE V
REJECTION RATES FOR INCREASING SAMPLE SIZES

Mean vectors appearing in the shifts:
  \mu = (0, 0, 0)',  \mu = (.5, .5, .5)'

Covariance matrices appearing in the shifts (rows separated by semicolons):
  \Sigma = [1 0 .3; 0 1 0; .3 0 1],  \Sigma = [3 0 0; 0 1 0; 0 0 1]

                      Sample size (n-1)
Shift               20        30        50

\alpha = .01

                  .0480     .0680     .1036
                  .0280     .0362     .0526

                  .1738     .2932     .5040
                  .0848     .1662     .3428